1. Introduction

This final report represents the first year of a projected three-year research effort into the continuation
of operation in distributed tactical information systems. The principal investigator is Dr. Dewayne F. Perry
of Pegasus Systems, and the work is sponsored by the Army Research Office under contract
DAAG29-82-C-0014 for the period 1 June 1982 through 31 May 1983.

2.  Research  Goals 

In this research project, we are investigating the techniques and tools required to build reliable systems that
provide continuity of operations despite component failure. Specifically, we are interested in software
solutions to the problems of reliability, recovery and reconfiguration.

The basic goals of this research, over the three-year effort, are summarized as follows:

• To identify and refine the fundamental issues and the relationships between these issues in the
construction of survivable distributed systems;

• To survey existing approaches to these various issues, to determine which are suitable for
survivable systems, and to explore their interaction with each other;

• To determine the levels of performance provided by these various design and implementation strategies; and

• To establish a design approach applicable to survivable distributed tactical information systems.

The basic underlying issue is the extent to which the goal of survivability places special requirements upon
the various aspects of system design and construction. Are there existing techniques that are satisfactory
within the constraints of acceptable system performance, or must new techniques be developed to solve the
particular needs of continuous operation?

The goals of this research are explored through the use of one or more models of actual tactical information
systems. These models are designed in various stages in order to fully understand the various solution
strategies, their implications and their cost. While these models represent "real" systems, the emphasis is
upon computer science issues rather than on application-related issues.

The Ada language will be used to specify, design and construct these various system models in sufficient
depth to expose the relevant problems of the design and, where they occur, the inabilities of Ada to handle
the problems of distributed system design and construction.

The goals for this first year of effort are as follows:

• define a model system abstracted from a "real" field situation;

• within the context of this model, identify the simplifying assumptions that are to be made in order to
restrict the complexity of the problem and control the interaction between various issues; and

• address the issues of redundancy, locus of control and reconfiguration.

The  effort  with  respect  to  the  third  goal  has  centered  primarily  on  issues  of  recovery. 

3.  Summary  of  Results 

First, the system model has been defined as an abstraction of a battlefield information system: a set of
battalions with a set of forward observers. This model has all the requirements needed for a framework for
considering the survivability of a distributed information system:

•  distributed  processors  (the  battalions  of  the  model); 

•  distributed  resources  (the  forward  observers);  and 

• distributed information (the battalion database).

A description of the model is to be found below.

Second, the simplifying assumptions have been delineated that restrict the focus of the investigation to the
appropriate level of detail and that support the issues under investigation. These assumptions have been
made in order to concentrate our attention on the problems of recovery and reconfiguration in response to
component failure. These assumptions are presented with the system model in the discussion below.

Finally, we have focused our attention specifically on the problem of global system recovery and present a
methodology for determining, and recovering the system to, the most recent consistent global state. An
example common to recovery and reconfiguration, that of transferring databases for backup and
repartitioning purposes, is given to illustrate the methodology.

4.  Research  Results 

The results of this research are discussed in the next two subsections. First, we present the system model and
accompanying simplifying assumptions. Second, we discuss the problems of global recovery and present a
methodology for determining global recovery in distributed, cooperating processes.

4.1. The System Model

The system model described below is abstracted from a "real" military system: a set of battalions together
with a set of forward observers which serve as the input sources; the output of the system can be ignored
without loss of generality. The battalions are independent components, dispersed geographically, that cooperate
with each other in gathering tactical information and that pursue individual as well as global goals. The
forward observers serve as the source of the tactical information.

By  abstracting  away  the  application  dependent  aspects  of  the  system,  we  define  the  following  system  model 
for  our  research.  The  model  consists  of 

•  a  set  of  autonomous,  homogeneous  nodes, 

•  a  set  of  remote  input  sources, 

•  a  database  of  information,  and 

• a fully connected network.

We will discuss the simplifying assumptions with respect to this model concerning the following issues:

•  node  behavior 

• remote input behavior

•  database  characteristics 

•  communications 

•  local  storage 

•  redundancy 

•  recovery 

The nodes in the system may enter and leave the system. It may be useful to distinguish two kinds of
node removal from the system: planned and unplanned. In the first case, the system can a priori provide the
temporary coverage within the system for that node (i.e., a basic minimum reconfiguration that allows for node
reentry later). In the second case, the worst case is assumed and the system recovers by reconfiguring around
the missing node. Whether reentry into the system causes reconfiguration is application dependent - we
assume it does cause reconfiguration.

Remote input sources behave in a manner similar to that of the nodes - they enter and leave the system.
The system provides a mapping of remote input sources into nodes that is reconfigurable depending upon the
behavior of the nodes and of the remote input sources. Remote input sources report to the system and the
system provides the mapping dynamically.

We assume that there are neither malicious nodes nor malicious remote input sources. Further, we assume that
each node detects its own failure and recovers to a consistent state. Local recovery - that is, without
reconfiguration - may be possible for this state of affairs.

The basic goal of the system with regard to the data base is no loss of data or information. For the sake of
simplicity, we assume that the data base is partitioned. Each node is responsible for its own data base.
Further, the mapping of remote input is not done to nodes per se, but to the partitioning of the data base
- a set of remote input sources is mapped to a particular partition of the database. Replication of the data
base is considered from the perspective of backup purposes only. The rationale for this approach is that we are
not concerned with accessing the data but only with maintaining the completeness of the data. Each partition
then is replicated on some other node.
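
The partitioning and backup scheme just described can be pictured with a small data structure. The following is only an illustrative Python sketch, not the project's Ada design: the names `Partition` and `assign_backups`, the node and observer labels, and the rule of placing each backup on the next node in the list are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    owner: str                                  # node responsible for this partition
    backup: str                                 # node holding the single backup copy
    sources: set = field(default_factory=set)   # remote input sources mapped here


def assign_backups(nodes):
    """One partition per node; backup on the next node in the list (assumed rule)."""
    if len(nodes) < 2:
        raise ValueError("a backup copy needs at least two nodes")
    return {
        node: Partition(owner=node, backup=nodes[(i + 1) % len(nodes)])
        for i, node in enumerate(nodes)
    }


partitions = assign_backups(["bn1", "bn2", "bn3"])
# A forward observer is mapped to a partition, not directly to a node.
partitions["bn1"].sources.add("fo-a")
```

Because the observers are attached to the partition rather than to the node, moving a partition to a different node carries its input mapping along with it.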

It is in the area of communications that we make the most stringent assumptions, because we wish to concentrate
on the higher level issues rather than on the mechanics of the connection between components. First, we
assume that the communication is fault free - either the communication completes correctly, or it does not
complete at all (and this latter condition occurs only if the destination node is unreachable). Second, the
network is adaptive in such a way that all nodes agree as to the faults in the system and as to the nodes
connected to the system. Third, notification occurs when a node or set of nodes is not reachable. Finally, we
assume that packet radio is used in order to provide the basis for performance determinations.
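
The first three communication assumptions can be illustrated with a toy in-memory network. This is a hedged Python sketch only: the `Network` class, its method names, and the `Unreachable` exception are hypothetical and stand in for whatever transport the real system would use.

```python
class Unreachable(Exception):
    """Raised when the destination node cannot be reached."""


class Network:
    def __init__(self, nodes):
        self.reachable = set(nodes)
        self.inbox = {n: [] for n in nodes}      # delivered messages per node
        self.notices = {n: [] for n in nodes}    # unreachability notices per node

    def disconnect(self, node):
        """Remove a node and give every remaining node the same notice."""
        self.reachable.discard(node)
        for other in self.reachable:
            self.notices[other].append(node)

    def send(self, src, dst, msg):
        """All-or-nothing delivery: complete correctly or not at all."""
        if dst not in self.reachable:
            raise Unreachable(dst)
        self.inbox[dst].append((src, msg))


net = Network(["bn1", "bn2", "bn3"])
net.send("bn1", "bn2", "report")
net.disconnect("bn3")
```

In this sketch a send can fail only because the destination is unreachable, and every surviving node sees the same list of notices, which mirrors the agreement assumption.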

Local storage for each node is assumed to be stable in the sense of [Lampson 76]. As a result of stable
storage and fault-free communication, we can assume there are no consistency problems with respect to the
data. The only problems are those due to latency of data between nodes.

As previously mentioned, the data base is fully replicated by means of a single copy backup. In order to
focus our attention on the most basic interaction, we make the (extremely simplifying) assumption that the
partitioning of the data base is such that the source and backup copies will not both be lost in a single
recovery/reconfiguration cycle.

For recovery, we assume that only one node will disappear at a time and that there is sufficient time
between disappearances for the recovery and reconfiguration to take place. Later, we will relax this
assumption. Further, we assume that a director of recovery and reconfiguration is selected to direct the
repartitioning and reconfiguration of the system.

With these assumptions, we have the basic framework to investigate the issues of recovery at a relatively
high level in order to examine the interactions between nodes.

4.2. Recovery

Some  of  the  basic  problems  that  occur  in  a  distributed  system  that  result  in  changes  to  the  system,  or  global 
state,  are  as  follows: 

• a node fails,

• a node enters the system,

• the interconnection between nodes fails, and

• the system is partitioned.

The traditional notion of recovery - returning to a previously consistent state - is not entirely applicable
here. It may be impossible to return the system state to a previous condition. The error conditions or the
causes of the error conditions may preclude such a return. It is the very disintegration of the global state that
may be the cause of the problem in a distributed system, particularly since the connections between
components, and the components themselves that comprise the global state, may fail.

It is for this reason - the disintegration of the global state - that backward recovery may not be a viable
recovery mechanism and that we turn our attention to forward recovery considerations. Recovery in a
distributed system under these conditions is more a matter of reconfiguration and the establishing of a new
system state. The basic goal, however, is still to maintain a consistent global state with regard to the data base.

A basic part of the recovery by reconfiguration is concerned with repartitioning the database. When a node
fails, the data maintained by the failed node needs to be integrated into another node and the remote input
sources remapped to that node. We will assume for the time being that the partition is kept intact, primarily
because repartitioning multiplies the amount of work for reconfiguration and we want to look at the simplest
case first.
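
The simplest case above, a single node failure with each partition kept intact, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function `reconfigure`, the partition and node labels, and the rule for choosing the new backup node (the first surviving node in sorted order that does not already hold the copy) are hypothetical and are not taken from the report.

```python
def reconfigure(owner_of, backup_of, nodes, failed):
    """Return new owner/backup maps and the backup transfers they require.

    owner_of and backup_of map a partition id to a node name. Only one node
    fails at a time, and no partition has the same owner and backup node.
    """
    survivors = sorted(n for n in nodes if n != failed)
    if len(survivors) < 2:
        raise ValueError("need two surviving nodes to keep a backup copy")
    new_owner, new_backup, transfers = dict(owner_of), dict(backup_of), []
    for part in owner_of:
        if owner_of[part] == failed:
            # The node holding the backup copy takes over the whole partition.
            new_owner[part] = backup_of[part]
        if failed in (owner_of[part], backup_of[part]):
            holder = new_owner[part]
            # Assumed placement rule: first survivor that is not the holder.
            target = next(n for n in survivors if n != holder)
            new_backup[part] = target
            transfers.append((part, holder, target))
    return new_owner, new_backup, transfers


owners = {"p1": "bn1", "p2": "bn2", "p3": "bn3"}
backups = {"p1": "bn2", "p2": "bn3", "p3": "bn1"}
new_owners, new_backups, moves = reconfigure(
    owners, backups, ["bn1", "bn2", "bn3"], failed="bn1"
)
```

Each entry in `moves` names a partition, the node that currently holds the surviving copy, and the node that must receive a new backup copy. Remote input sources need no separate remapping here because they are mapped to partitions rather than to nodes.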

One of the major concerns in this reconfiguration is the movement of a new backup copy of the database to
the new backup node. This involves the participation of three nodes: the initiator node, i.e., the director of
the recovery; the source node of the backup file; and the destination node where the partition is now to
reside, either as a backup or as integrated into the node's database. The following represents the interaction
between the nodes.

The interaction of the three nodes can be seen in the following state description of the three nodes
simultaneously. Where the states proceed concurrently, the nodes can proceed concurrently. Where there is
no processing indicated, the node is waiting for synchronization on the next non-empty global state. The
global states are separated by lines; global state transitions that represent global recovery points are indicated
by asterisks.

In the normal case, the global state transition occurs in parallel, with interacting processes proceeding
concurrently on the basis of the message transmission and reception. A send does not complete until the
message has been received. Non-interacting processes proceed directly to their next non-empty global state.

If a connection is broken and then resumed, the two interacting processes must determine the most
recent state to reinstate. The usual case is to recover back to the earliest of the two processes' states and proceed
from there. In this case, however, one of the processes may be able to establish a more recent state as the
effective state because it represents the system global state where it is most important - in this case, with
respect to the database.

When connections are broken and a connection is reestablished, there are several possible courses of action.
First, the recovery of the global state may only affect the two processes involved. In this case, no further
recovery is needed beyond that required to resolve the appropriate state of the system. Second, the recovery may
affect other processes involved in the operation and thus a more global recovery action is needed.

For example, once state 4 has been reached, there is no need to recover the cooperating processes to a point
earlier than that if the connection between the Source and the Destination is broken, unless the Source node
must recover to a state prior to state 4. If the Source node fails and recovers to global state 8 but the
Destination process is in state 10, there is no need for both the Source and the Destination nodes to recover
back to state 8 when state 10 will maintain the consistency of the recovery plan - it already has installed the
database.

The net result of this kind of analysis is that recovery can move forward to a later state (that is, the most
recent global recovery state), rather than back to an earlier state, under certain conditions.
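
The choice between backward and forward recovery in the example can be sketched in a few lines. This Python fragment is an illustration under assumptions, not the report's methodology: the function name `effective_state` is hypothetical, and so is the idea of tagging each global recovery point with the process whose progress establishes it. The assignment of points 4 and 10 to the Source and the Destination is inferred from the example in the text.

```python
def effective_state(local_states, recovery_points):
    """Pick the global state to resume from after a broken connection.

    local_states maps a process name to the state it has recovered to.
    recovery_points maps a global recovery point to the process that
    establishes it (an assumed representation).
    """
    # Backward recovery: fall back to the earliest state of the processes.
    backward = min(local_states.values())
    # Forward recovery: any recovery point already established by its owner.
    established = [
        point for point, owner in recovery_points.items()
        if local_states[owner] >= point
    ]
    return max(established + [backward])


# Hypothetical recovery points taken from the example in the text.
POINTS = {4: "source", 10: "destination"}

resume_at = effective_state({"source": 8, "destination": 10}, POINTS)
```

With the Source at state 8 and the Destination at state 10, the sketch resumes at state 10 rather than rolling both processes back to state 8. If the Source had to recover to a state before 4, no recovery point would be established and the sketch would fall back to the earliest state.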

In  summary,  the  following  considerations  result  from  this  methodological  approach. 

•  There  are  identifiable  global  states  that  represent  either  the  completion  of  interaction  between  two 
cooperating  processes  (or  nodes)  or  that  represent  important  events  in  one  of  the  processes  that 
can  be  considered  the  point  event  (for  example,  the  reception  of  the  data  base,  or  the  installation 
of  the  database). 

•  An  important  question  is  how  to  represent  and  distinguish  local  and  global  states.  For  example,  a 
group  of  local  states  might  be  grouped  together  to  represent  an  important  global  state  with  respect 
to  the  other  cooperating  processes. 

•  Synchronization  points  are  expressed  by  message  exchanges  and  represent  important  transitions  in 
the  global  state. 

• The dichotomy between local and global state changes can be expressed in parallel for the
cooperating processes. Even though they are distinct events, they do occur in parallel and can be
represented that way.

4.3.  Summary 

The preceding discussion represents a preliminary investigation into the problems of recovery in distributed
systems. A system model has been constructed and restrictions placed on it to narrow the focus of the
research. The approach to determining forward recovery points is an initial look at ways of reconstructing the
global state of a distributed system when the global state disintegrates because of node failure.